Concerted transformation of a hyper-paused transcription complex and its reinforcing protein

RfaH, a paralog of the universally conserved NusG, binds to RNA polymerases (RNAP) and ribosomes to activate expression of virulence genes. In free, autoinhibited RfaH, an α-helical KOW domain sequesters the RNAP-binding site. Upon recruitment to RNAP paused at an ops site, KOW is released and refolds into a β-barrel, which binds the ribosome. Here, we report structures of ops-paused transcription elongation complexes alone and bound to the autoinhibited and activated RfaH, which reveal swiveled, pre-translocated pause states stabilized by an ops hairpin in the non-template DNA. Autoinhibited RfaH binds and twists the ops hairpin, expanding the RNA:DNA hybrid to 11 base pairs and triggering the KOW release. Once activated, RfaH hyper-stabilizes the pause, which thus requires anti-backtracking factors for escape. Our results suggest that the entire RfaH cycle is solely determined by the ops and RfaH sequences and provide insights into mechanisms of recruitment and metamorphosis of NusG homologs across all life.


SUMMARY
RfaH, a paralog of the universally conserved NusG, binds to RNA polymerases (RNAP) and ribosomes to activate expression of virulence genes.In free, autoinhibited RfaH, an α-helical KOW domain sequesters the RNAP-binding site.Upon recruitment to RNAP paused at an ops site, KOW is released and refolds into a β-barrel, which binds the ribosome.Our structures of ops-paused transcription elongation complexes alone and bound to the autoinhibited and activated RfaH reveal swiveled, pre-translocated pause states stabilized by an ops hairpin in the non-template DNA.Autoinhibited RfaH binds and twists the ops hairpin, expanding the RNA:DNA hybrid to 11 base pairs and triggering the KOW release.Once activated, RfaH hyper-stabilizes the pause, which thus requires anti-backtracking factors for escape.Our results suggest that the entire RfaH cycle is solely determined by the ops and RfaH sequences and provide insights into mechanisms of recruitment and metamorphosis of NusG homologs across all life.

HIGHLIGHTS
-The nontemplate DNA strand of an ops-paused transcription complex forms a hairpin -Autoinhibited RfaH binds and twists the ops hairpin to expand the RNA:DNA hybrid -RfaH-hairpin contacts are solely responsible for triggering RfaH activation -Upon recruitment, RfaH hyper-stabilizes the pause and promotes RNAP backtracking
Many cellular genomes encode specialized paralogs of NusG 21 , whose roles are best characterized in bacteria, where they are required for biosynthesis of capsules 22,23 , toxins and adhesins 24,25 , antibiotics 26,27 , and lipopolysaccharides 28,29 .Many NusG paralogs are encoded on conjugative plasmids, from F-factor 30 to multidrug-resistant clinical isolates 31 .Thus, NusG paralogs are vital for bacterial fitness and pathogenesis and may also facilitate evolution 32 .
NusG paralogs function alongside NusG and have just a few targets, which become critical in some conditions, e.g., during Klebsiella pneumoniae lung infection 22 .The regulatory logic that underpins the division of labor among NusG-like proteins is well understood in Escherichia coli.NusG is an abundant protein that dynamically interacts with almost every transcribing RNAP 3 and determines the fate of the nascent RNA by either suppressing or promoting Rhodependent termination.On translated mRNAs, NusG can bridge RNAP to the leading ribosome 19,20 whereas on rRNA, NusG is part of an antitermination complex 6 ; in both cases, the nascent RNA and RNAP are shielded from Rho.On antisense, aberrant, and xenogeneic RNAs, NusG KOW binds to Rho to induce premature termination [33][34][35] .Xenogeneic silencing is an essential function of E. coli NusG 36 .
Conversely, expression of several virulence and housekeeping operons critically depends on NusG paralog RfaH 37 .RfaH, but not NusG, is associated with RNAP transcribing these xenogeneic operons 38 , even though NusG vastly outnumbers RfaH in the cell 39 .A combination of sequence-specific recruitment and fold-switching-controlled autoinhibition ensures that RfaH finds its targets while not compromising the essential function of NusG 32 .
In free RfaH, the RNAP-binding site on NGN is masked by its KOW domain adopting an αhelical hairpin (KOW α ) 11 .RfaH recruitment requires a 12-nucleotide (nt) operon polarity suppressor (ops) sequence (Figure 1) present in 5'-untranslated regions of RfaH target operons 28 .The ops element halts RNAP to allow time for RfaH recruitment 40 and makes direct contacts to RfaH NGN 12,41 .Upon binding to the ops-paused elongation complex (opsPEC), RfaH is activated through domain dissociation 42 .The released NGN accommodates on RNAP and converts the enzyme into a pause-resistant state 12 , while the freed KOW refolds into a NusG-type five-stranded β-barrel (KOW β ) and binds to the ribosomal protein S10 42 to couple transcription to translation and activate protein synthesis 43 .Notably, RfaH recruitment to RNAP and the ribosome must be tightly orchestrated: ops is the only chance for RfaH to load onto transcribing RNAP, and the ribosome must be captured by RfaH between ops and the start codon, located within 100 nts downstream.Failure of either recruitment cripples gene expression by hundreds of folds 37,43 .
Understanding such synchronicity requires elucidation of minute structural detail.While much insight has been provided by existing structures of autoinhibited and activated RfaH (in isolation, bound to ops DNA, and bound to an ops-modified non-paused EC 11,12,41,42 ), their limitations leave several key questions unanswered.First, what features of opsPEC render it exceptionally efficient at recruiting RfaH, which is present at fewer than 100 copies per cell?Second, in the autoinhibited RfaH, the RNAP-binding site is partially occluded -how does RfaH bind to opsPEC?Third, how is RfaH domain dissociation triggered upon binding to opsPEC?Fourth, after accommodation of RfaH, how are ribosome recruitment and pause escape achieved?In this work, we sought to answer these questions by comparing structures of opsPECs alone and bound to the autoinhibited and activated RfaH.We show that the formation of an ops hairpin (opsHP) in the non-template (NT) DNA strand stabilizes a swiveled, pre-translocated paused state with an extended bubble and a 10-base pair (bp) RNA:DNA hybrid.Using the opsHP as a handle, autoinhibited RfaH docks onto opsPEC near its final binding site, priming the KOW for activation.RfaH grabs and twists the opsHP, further expanding the bubble and the hybrid to 11 base pair (bp) and thus hyper-stabilizing a pause that requires accessory factors for escape.Finally, our molecular dynamics simulations suggest that, following the initial domain separation, the RNAP-RfaH contacts are dispensable for the KOW α-to-β fold-switch.Together, our findings illuminate the molecular details underlying a paradigmatic, multi-step process of deoxyriboregulation of RNAP.

RNAP pauses at the ops element in a pre-translocated state
To enable RfaH recruitment, RNAP (a five-subunit α2ββ'ω complex in E. coli) pauses at the ops site, yielding a pre-translocated PEC 44 .A previous analysis revealed the structure of a non-paused EC bound to RfaH (RfaH-opsEC) 12 .In that study, the EC was assembled on a scaffold (Figures 1B and S2A) with the ops element in the NT DNA, which is sufficient for RfaH binding 40 , and a template (T) DNA strand that lacked complementarity in the last 10 ops nucleotides, most notably the 3'-terminal G12 in the NT DNA (G12; thereafter, all nucleotides are numbered to reflect their positions in the ops element, with RNA and T-DNA nts denoted with an R/T superscript), thereby favoring the separation of G12 from the T DNA and the adoption of a post-translocated state.Furthermore, the RNA sequence allowed only a 9-bp hybrid, whereas longer hybrids may occur during ops-mediated and RfaH-stabilized pausing.
Finally, the scaffold had a short 6-bp upstream DNA duplex, possibly precluding nucleic acid repositioning via RNAP or factor contacts to more distal DNA regions.Thus, although the structure of RfaH-opsEC captured the molecular details of RfaH interactions with RNAP and DNA, the molecular basis of initial ops-mediated pausing is presently unknown.
Here, we assembled an EC on a fully complementary scaffold harboring the ops site, an extended upstream duplex (up to 14 bps), and an RNA that could form a hybrid of up to 11 bps at the expense of the proximal upstream DNA melting (Figure 1B).We then elucidated the atomic structure of thus assembled opsPEC by cryogenic electron microscopy (cryoEM) in combination with single-particle analysis (SPA) (Figures 1 and S1, Table S1).The particles were highly homogeneous and in the cryoEM reconstruction, almost all regions of RNAP, all DNA nts and 10 nts of RNA were well resolved, with fragmented density for RNA outside of the RNA exit channel.
A PEC formed at ops is biochemically distinct from a PEC stabilized by an RNA hairpin, e.g., hisPEC 44 .Consistently, while in structures of hisPECs, the RNA:DNA hybrid adopted a half-translocated state 12,45 (RNA post-translocated, DNA pre-translocated), our opsPEC resides in the pre-translocated state (Figure 2A).RNAP is paused at U11, and no unpaired T-DNA nucleotide is present in the i+1 position to receive an incoming rNTP.Compared to an elongation-competent, post-translocated EC 46 , hisPECs adopt a swiveled state, in which a so-called swivel module (clamp, dock, shelf, SI3, and a C-terminal segment of the β' subunit) is rotated by ~3 ° about an axis perpendicular to the plane defined by the axes of the upstream DNA duplex and the RNA:DNA hybrid 12,45 .Swiveling is thought to stabilize the paused state by counteracting folding of the β' trigger loop (TL), which is required for nucleotide addition 12,45 .Consistent with the exceptionally strong pausing at ops 47 , opsPEC undergoes particularly pronounced swiveling of 5.8° (Figure 2B).As expected for a swiveled state, the TL is unfolded (Figure 2C) and the β' SI3 domain is in the open conformation.In opsPEC, but not in EC 46 , the TL β'R933 forms salt bridges with βE546 and βD549, an interaction that may stabilize the unfolded TL (Figure 2C).
In contrast to all other structures of factor-free ECs reported so far, the NT strand is fully defined in the cryoEM reconstruction (Figure 2A).NT bases G2-C9 form a hairpin that rests on top of the R314 side chain of the β' rudder (Figures 2D and 2E).The stem of the opsHP comprises two canonical Watson-Crick (WC) bps (G2:C9 and C3:G8) and a Saenger XI bp (G4:A7) and is stabilized by positively charged residues of the β lobe (R201, R371, R394), β protrusion (R470, R473), and β' rudder (K321); G1 forms the most proximal bp of the upstream DNA duplex (Figure 2E).Accommodation of the opsHP at the β lobe/protrusion pushes the upstream DNA duplex away from the β protrusion and against the β' zipper and the β' clamp helices (CH), effectively promoting swiveling (Figures 2B and 2F).Consequently, the upstream and downstream DNA duplexes span an angle of 102 °, vs ~129 ° in the canonical EC (Figures 2A and S2B).
The two-nt opsHP loop (G5-T6) is required for specific recognition by RfaH 12,41 .In opsPEC, this loop is located between the upstream DNA channel and the main channel of RNAP (Figure 2F).The T6 base is completely flipped outwards and is highly flexible, as indicated by the lack of cryoEM density (Figure 2A), and G5 is also rotated outwards.The opsHP-induced displacement of the upstream DNA and swiveling lead to an increased solvent accessibility of the opsHP loop, presenting the nucleobases of G5 and T6 for sequence-specific readout.
Immediately downstream of the opsHP, the NT DNA changes direction; G10 and T11 move away from the transcription bubble and form an extended stack with βW183 that is laterally stabilized by βD199 and βR200 (β lobe; Figures 2D and 2E).G12 is embedded in the core recognition element (CRE) pocket 48 , between βR151, βI445, βD446, βR451, βL538, and βV547 (β lobe, β protrusion, and following residues; Figure 2G).Thus, despite the presence of a complementary C12 T , sequestration of G12 at the CRE prevents it from pairing with C12 T ; instead, the following C13 is diverted to form the first bp of the downstream DNA duplex (Figure 2G).
In the canonical pre-translocated EC, 10 DNA bps are melted to form a 10-bp RNA:DNA hybrid 49 .Formation of the opsHP requires 11 single-stranded (ss) nts in the NT strand -thus, an additional upstream DNA bp is melted and the last unpaired T-strand nt, C12 T , remains stacked on the downstream duplex and does not move into the templating i+1 position (Figure 2G).Consequently, the DNA strands in the bubble are compressed as compared to canonical ECs, causing a sharper angle between the upstream and downstream duplexes.
The side chain of βR542 occupies the position of G12 in the duplex, engaging the Watson-Crick face of C12 T (Figure 2G).This interaction may stabilize the pre-translocated state or promote local melting of downstream DNA to facilitate pause escape.To decipher the βR542 contribution, we substituted it for an alanine and characterized the pausing behavior of wildtype (wt) and β R542A RNAP variants in vitro.We used a template that contains a strong T7 A1 promoter followed by the ops and his pause elements 11 .On this template, RNAP can be halted at ops G5 in the absence of UTP; thus synchronized, α 32 P-labeled G5 ECs are restarted upon the addition of all NTPs.RNAP strongly pauses at the ops (U11) and his pause sites before making the run-off (RO) RNA.The wt RNAP paused at U11 with a half-life of 8 seconds, whereas βR542A substitution delayed escape ~2.5 fold (Figure 2H), suggesting that βR542 promotes escape from the ops pause, in contrast to its effect at the elemental pause 50 .The flexible sidechain of βR542 can interact with the edge of the downstream DNA (e.g., PDB IDs 5VOI, 7YPA, 8FVW) or NT DNA -1/+1 nts (e.g., PDB IDs 8EG7, 8EG8, 8EH8).Accordingly, it is conceivable that βR542 effects on pause escape may differ depending on the sequence context.In context of opsPEC, βR542 presumably functions as a catalyst of DNA separation by temporarily replacing base pairing with the protein-T DNA interaction.
To evaluate the role of residues that appear to stabilize the opsHP, we substituted arginine residues in the β lobe (R371A/R394A), β protrusion (R470A/R473A), or β' rudder (R314A) and characterized pausing of the RNAP variants in vitro.We found that neither substitution had significant effects on pausing at ops (Figure 2H), as was also observed with base substitutions that destabilized the opsHP stem 41 .
In order to assess the basis of the extension of ss DNA regions in the transcription bubble, we determined a 3.0 Å cryoEM/SPA structure of an opsPEC assembled on a partially noncomplementary scaffold (Figure S2C, Table S2).The nc-opsPEC, which was virtually identical to opsPEC assembled on the complementary (c) scaffold (Figures S2D-S2H).Thus, the extended ss regions in the opsPEC form independently of the precise sequences forming the bubble and the hybrid, suggesting that this extension principally depends on the NT-DNA HP.

RfaH recruitment proceeds via a hyper-paused encounter complex
In autoinhibited RfaH, the KOW α masks the β'CH-binding site on NGN 11 , yet NMR data show that autoinhibited RfaH binds RNAP near its final binding site 42 , suggesting the existence of a transient encounter complex, opsPEC Enc .To image this complex, we used a F51C,S139C RfaH variant (RfaH CC ) locked in the autoinhibited state by a disulfide bridge between NGN and KOW α (Figure 3A).We previously showed that RfaH CC is fully active under reducing conditions 11 , and far-UV CD and 2D [ 1 H, 15 N]-HSQC spectra confirmed that RfaH CC adopted the expected states under non-reducing and reducing conditions, while analytical gel filtration ensured the homogeneity of the sample (Figure S3).
We mixed RfaH CC with the assembled opsPEC under non-reducing conditions and determined the structure of the ensuing opsPEC Enc by cryoEM/SPA at 2.6 Å resolution (Figure 1A, Table S1).Except for the β' zinc binding domain (ZBD), which could not be reliably modeled, most RNAP elements and the nucleic acid regions were well defined in the cryoEM reconstruction.In addition, there was a clear density for RfaH CC .
The opsPEC Enc is swiveled, albeit less strongly than opsPEC (4.9 ° vs. 5.8 °, Figure S4A), with an unfolded TL and an open SI3.Correlating with reduced swiveling, the β'CH are slightly displaced from the β lobe/protrusion, and the angle between upstream and downstream DNA duplexes is increased compared to opsPEC (~116 °) but remains smaller than in the canonical EC (Figure 3B).Therefore, sufficient space exists between the β lobe/protrusion and β'CH for RfaH CC NGN to bind the opsHP and RNAP (Figures 3B and 3C).However, the KOW α -NGN interaction prevents full accommodation of RfaH as observed in RfaH-opsEC 12 .
In RfaH-opsEC 12 , RfaH is bound in an open conformation; NGN is positioned across the RNAP main channel, with helices α1 and α2 contacting the β protrusion and β lobe, respectively, and the opposite open flank of the central β-sheet and helix α3 contacting the β'CH.In opsPEC Enc , the NGN packs with helix α2 against the β lobe similar to RfaH-opsEC, whereas the NGN is rotated about the α2 axis towards the β protrusion (Figures 3C, S4B and S4C).NGN helix α1 is moved closer to, and interacts more intimately with, the β protrusion, while on the other flank, the loop preceding NGN helix α3 is displaced from the β lobe and instead interacts with the β' clamp region neighboring the β'CH.Helix α3 is thereby moved away from the β'CH, and its rough position is instead occupied by the α2* helix of KOW α (*, secondary structure elements in KOW; Figures 3C, 3D and S4B).
G5 and T6 in the opsHP loop serve as crucial anchors during RfaH recruitment 41 .In full agreement, in opsPEC Enc , the opsHP stem has the same base pairing pattern as in opsPEC and its tip is engaged by NGN (Figures 3E, 3F, S4C).The NGN-opsHP interaction closely resembles the crystal structure of the RfaH-ops binary complex 41 ; i.e., the opsHP is bound at NGN opposite KOW α , T6 is inserted into a positively charged pocket of NGN, and G5 packs against a neighboring surface (Figures 3D and S4D).Nucleobase-specific interactions between NGN and the opsHP loop (Figure 3E) show that RfaH reads out the NT-DNA sequence already in opsPEC Enc .T6 not only contacts NGN but it also stacks on βY62 (β protrusion; Figure S4C), an interaction not observed in RfaH-opsEC 12 .Additionally, the same positively charged RNAP side chains as in opsPEC contact the opsHP stem, but at different chemical functionalities or nts.As a result, T6 is snuggly sandwiched between NGN and RNAP, and the entire opsHP is tightly restrained, as indicated by its well-defined density (Figures 3B and 3F).
Upon NGN grabbing a hold of the opsHP loop, the opsHP is repositioned, with profound consequences for the paused state.Docked RfaH CC twists the opsHP and redirects its tip towards the upstream DNA, while G10 unstacks from T11, which retains stacking interactions with βW183 (Figures 3G, 3H and S5C).Furthermore, Y8 and R11 in the loop preceding helix α1 of RfaH CC contact the sugar-phosphate backbone of upstream DNA on the major groove side (Figure 3D); based on these DNA "anchors", the proximal T-DNA branch of the upstream duplex is pulled against the β' rudder as RfaH CC engages the opsHP (Figure 3H).The β' rudder thereby acts like a strand separator, displacing C1 T from G1 to melt an additional upstream DNA bp (Figures 3G-3I).A β-hairpin loop (HL; M32-L50) of NGN is well-defined in opsPEC Enc , and R40 at the HL tip contacts the more distal upstream DNA backbone (Figures 3D and 3E).
In opsEC-RfaH, the HL is disordered, possibly due to the short upstream DNA duplex employed 12 .
Upstream DNA melting leaves G1 unpaired, and its nucleobase stacks with the neighboring upstream nt (G0); the β'CH residues (β'R271, β'N274) stabilize G1 and the new most proximal bp of the upstream DNA duplex (β'R270).Most importantly, upon RfaH CC -induced repositioning of the opsHP and upstream DNA melting, the liberated C1 T is paired with G1 R , extending the RNA:DNA hybrid to 11 bps (Figures 3G-3I).As in opsPEC, β'L255 and β'R259 (β' lid) cap the upstream edge of the hybrid, leading to its further compression, an increase in diameter by ~1 Å, and, thus, a more A-like conformation compared to the 10-bp hybrid in opsPEC (Figures 3H and 3I).The hybrid remains pre-translocated and its downstream end matches the conformation observed in opsPEC, but density for the most downstream bp is less well defined (Figure 3B).The hybrid is thus pushed backward due to the destabilization of its downstream end, further counteracting translocation.Thus, pausing not only persists but is reinforced during initial RfaH recruitment, changes that might be required to allow RfaH activation and full accommodation.
We wondered how strongly the initial docking of autoinhibited RfaH onto the opsHP drives upstream DNA melting and, consequently, hybrid expansion.Therefore, we assembled an EC on a scaffold in which a WC bp in the upstream DNA would have to be disrupted and formation of a non-WC C:U pair would have to be "forced" to generate the 11-bp hybrid (Figure S5).The 3.1 Å cryoEM/SPA-based structure revealed that the ensuing RfaH CC -bound complex was virtually identical to opsPEC Enc .Strikingly, even though the C:U bp in the hybrid does not energetically fully compensate for the lost bp in the upstream DNA, the hybrid was extended to 11 bp (Figures S5E-S5G).We conclude that docking of RfaH provides a strong driving force for upstream DNA melting and hybrid expansion, leading to a hyper-paused opsPEC Enc .
The overall conformation of RfaH CC in opsPEC Enc closely resembles the structure of isolated, autoinhibited RfaH 11 , but embedding of RfaH CC between the β lobe, β protrusion and β'CH leads to a slight displacement of KOW α relative to NGN (Figures 3D, S4B, S4E).While full displacement of KOW α is prevented by the disulfide bridge, the β'CH tip acts like a wedge that starts to insert between NGN and KOW α (Figures 3D and S4B).Concomitantly, KOW α helices α1* and α2* are unwound by two N-terminal turns and one C-terminal turn, respectively (Figure S4E).Thus, upon initial docking to opsPEC, autoinhibited RfaH takes a handle of the opsHP loop to pull its own KOW α against the β'CH, generating steric conflicts that prime KOW α dissociation and subsequent refolding, and thus RfaH activation.

The hyper-paused state persists after full accommodation of RfaH
To follow the complete RfaH recruitment and activation, we determined cryoEM/SPA structures of opsPECs assembled on c-and nc-scaffolds with RfaH wt (Figures 1, 4A, S6; Tables S1 and S2).To facilitate possible conformational changes, we incubated the samples at 37 °C for 10 minutes before vitrification and imaging.CryoEM reconstructions followed by 3D variability analysis (3DVA) revealed two nearly identical states (Figure 4A).We focus on complex 1 assembled on the c-scaffold in which both RfaH domains are visible (Figure 4A, left; thereafter designated as opsPEC Rec ) and discuss the alternative state below.
RfaH is bound in its activated state (Figure 4B), with NGN fully embedded between the β lobe, β protrusion and β'CH, as observed in RfaH-opsEC 12 .As compared to opsPEC Enc , NGN is rotated about the axis of helix α2, so that α1 is slightly displaced from the β protrusion, while α3 moves to contact the β'CH (Figure 4C).In contrast to RfaH-opsEC, opsPEC Rec retains a swiveled conformation, albeit with a reduced swiveling angle of 3.3 ° as compared to opsPEC Enc (4.9 °).Thus, RfaH binding per se does not prevent swiveling as previously suggested 12 .
The opsHP conformation and position are essentially unaltered compared to opsPEC Enc and very similar to RfaH-opsEC (Figures 4B vs 3B).However, the loop preceding RfaH helix α1 is moved closer to the upstream DNA and engages in more intimate interactions with the sugar-phosphate backbone on the major groove side; the HL-upstream DNA contacts are also maintained, albeit to a more proximal region of the upstream duplex (Figure 4C).We observed clear density for the refolded KOW β , which is positioned on top of the HL and contacts the β'ZBD, which becomes ordered (Figures 4B and 4C).KOW β residues E124 and F126 in the loop connecting β1* and β2* sandwich the β'ZBD K87 and the upstream DNA is displaced towards the β protrusion and β flap (Figures 4C and 4D).In RfaH-opsEC, KOW β was observed in a similar location 12 , also interacting with the β flap tip helix (FTH).In opsPEC Rec , in contrast, the βFTH remains disordered, possibly because the longer upstream DNA we employed prevents close approach to KOW β .Notably, the binding site for S10 is exposed in KOW β .
In opsPEC Rec the upstream duplex is pushed against the β' rudder, maintaining the additional melted bp, and the hybrid is compressed and pre-translocated.The same configuration is also adopted in opsPEC Rec assembled on the nc-scaffold, showing that the strong driving force for upstream DNA melting and hybrid expansion is maintained in opsPEC Rec .In summary, opsPEC Rec remains hyper-paused, with KOW β poised to engage a ribosome to form an RfaH-bridged expressome.

Refolding landscape of RfaH upon recruitment to opsPEC
Ribosomal interactions with its KOW β element are critical for the cellular function of RfaH 43 .
Thus, the KOW fold-switch is the final step in RfaH activation.Our structures capture the autoinhibited and activated states of RfaH-bound opsPECs but provide no information about their interconversion.Molecular dynamics (MD) simulations have been used to interrogate the KOW switch 51 , but these simulations were done with the isolated RfaH or KOW.To explore the fold-switch of RfaH bound to opsPEC, we generated an all-atom dual-basin structurebased model (SBM), such as those employed to study massive structural transitions of influenza hemagglutinin 52 and SARS-CoV-2 spike protein 53 .In 500 independent runs performed using a dual-basin SBM created based on opsPEC Enc and opsPEC Rec , KOW underwent a complete α-to-β fold-switch.
Figure 5 shows the refolding landscape of RfaH projected onto the fraction of interdomain (ID) contacts (QID) and the difference in the fraction of native contacts formed with respect to either KOW α or KOW β (Qdiff).To easily visualize KOW β when QID = 0, we employed the distance between the NGN and KOW instead.RfaH fold-switch (Figure 5A) requires that at least 60% of the ID contacts are broken (Figure 5B).Following the fraction of formed native contacts for each KOW fold (Figure 5C) reveals that RfaH refolding is rugged, with at least four intermediate states, I1-I4.I1 and I2 are connected to the KOW α basin, whereas I3 and I4 have a higher fraction of KOW β -like native contacts (Figure 5D).Refolding of RfaH starts by the loss of native contacts at the N-terminus of the KOW α1* helix (residues 117-122, I1), followed by unwinding of the end of KOW α2* helix (residues 148-155; I2) (Figures 5D and  5E).Notably, this occurs as observed in the opsPEC Enc cryoEM structure (Figures 3D and   S4E) even though the helical content of these regions was restored using homology modelling, and is in agreement with hydrogen-deuterium exchange mass spectrometry data showing that the hairpin tip is more stable than the termini of the helices 54 .Most ID interactions, except for those between NGN 85-100 and KOW 114-126/150-162 from the ends of KOW α1*/α2* helices, persist through I2.
Next, we analyzed the sequence of events within RfaH and between RfaH and opsPEC during the transition from opsPEC Enc to opsPEC Rec .We examined the disruption of native ID contacts and the formation of native contacts for KOW β , contacts between the NGN α3 helix (residues 90-100) and RNAP, and contacts between KOW β and the β'ZBD as a function of time.The first event enabling RfaH refolding is the breakage of 70% of ID interactions (peak at 400 , Figure 5G), in line with the evidence that the NGN-KOW contacts control RfaH metamorphosis 55,56 .Either concurrently or after domain dissociation (1000 ), α3, which harbors a highly conserved I93 residue that stabilizes the ID interface 57 , binds to β'CH (Movies S1 and S2).In these simulations, α3 gets locked in place by forming >75% of its native contacts with RNAP and enabling tighter binding of RfaH to opsPEC.Then, KOW refolds into the β-barrel (peak at 1100 ).This event is not concurrent with binding to DNA, β flap or β'ZBD (Figure 5H and Movies S1 and S2).
Our simulations reveal that the KOW switch is initiated by unwinding of the ends of its αhelices, proceeds via a largely unfolded state that transiently retains some helical elements at the tip of the KOW hairpin, culminating with the formation of KOW β through heterogeneous intermediate states where different β-strands come together.These findings are in agreement with our structural (Figure 3) and biophysical analyses 54,58 and with MD simulations performed in the absence of RNAP 55,59,60 .We conclude that the KOW transformation is independent of its interactions with the opsPEC.

The opsPEC Rec can be arrested
The opsPEC has been identified as a class II, or backtrack-prone, pause 44 and our results show that RfaH further stabilizes the paused state.How does opsPEC Rec resume elongation?
To answer this question, we subjected opsPEC Rec after NTP addition and heating (37 o C) to cryoEM/SPA, yielding a cryoEM reconstruction at 3.0 Å resolution.The resulting complex was nearly identical to opsPEC Rec (Figures 6A and S7A; Table S1) with a major exception: a clear additional density in the secondary channel showed that the transcript had been elongated by at least two nts followed by RNAP backtracking (Figures 6B and 6C).These findings establish that opsPEC Rec is elongation-competent and that, following nucleotide addition, RNAP slides back, generating opsPEC Back .
In the cell, backtracked RNAP can be rescued by Gre factors 49 , Mfd 61 , or the coupled ribosome 62 .We next tested if RNAP escape from ops can be promoted by anti-backtracking factors in vitro.In agreement with our structural data, RfaH strongly delays elongation two nts downstream from the ops pause site: the C13 pause persists for minutes even at 0.2 mM NTPs (Figure 6D).The addition of GreA, which induces the nascent RNA cleavage in backtracked ECs 49 , dramatically shortened the pause.A similar but less dramatic effect was observed in the presence of Mfd, a DNA translocase that pushes RNAP forward 61 .By contrast, the 70S ribosome did not promote escape, an expected result given that the ribosome must exert force on RNAP to assist forward translocation, not just sterically block reverse translocation 62 , and translation initiates only 50+ nts downstream from the ops site.

The KOW domain contributes to pause escape
3DVA of opsPEC Rec revealed that KOW β binding at the β'ZBD was correlated with a movement of the upstream DNA towards the β protrusion and β flap (Movies S3 and S4; Figure S6).The apparent push of KOW β on upstream DNA suggests that it reinforces HL-upstream DNA contacts, which are already established in opsPEC Enc and may either support hyper-pausing by helping drive the proximal end of the upstream duplex into the β' rudder or, alternatively, could counteract backtracking of opsPEC Rec .The removal of KOW potentiates the RfaHinduced delay at C13 (Figure 4E).While a fraction of arrested complexes may escape by releasing RfaH and reformation of the autoinhibited state in full-length RfaH, but not in the isolated NGN, this observation is consistent with the idea that, after RfaH accommodation and refolding, KOW β /HL act as anti-backtracking devices.A similar effect has been observed for SuhB-reinforced NusG-upstream DNA contacts in an rRNA antitermination complex 6 .
3DVA also showed that in some complexes, density for KOW β was anti-correlated with density for the opsHP; i.e., one boundary state exhibited clear density for KOW β bound at the β'ZBD but weak density for the opsHP and downstream ops nts, whereas the other lacked density for KOW β but exhibited very well-defined density for the opsHP and downstream nts (Movies S3 and S4).Thus, in addition to dampening backtracking, KOW β binding at the β'ZBD seems to facilitate pause escape by destabilizing opsHP-NGN interactions.

NusA and KOW may cooperate during ribosome loading
NusA, a general elongation factor that associates with most ECs 3 , modulates RNAP pausing, termination, and antitermination through contacts to the nascent RNA or accessory factors 6,45,[63][64][65] and can aid in coupling transcription to translation 20 .NusA binds to the βFTH 45 and is expected to associate with, and possibly trigger conformational changes in, RfaH-bound ECs.To ascertain that our conclusions would hold in the presence of NusA, we assembled a NusA-modified opsPEC Rec and determined its structure by cryoEM/SPA at 3.2 Å resolution.
The cryoEM reconstruction revealed that, except for added NusA, the ensuing complex is essentially identical to opsPEC Rec , suggesting that NusA binding per se does not induce conformational changes in opsPEC Rec (Figure S7) and does not alter the KOW β presentation.Thus, NusA could provide yet another platform for ribosome engagement through direct contacts to the ribosome 20 .

DISCUSSION
In this work, we present a series of structures that delineate co-transcriptional recruitment and activation of RfaH (Figure 7).We show that opsPEC is an unusual example of the consensus pause in which the ops element, a target of RfaH, forms a short NT-DNA hairpin, opsHP, exposed on the RNAP surface.Folding of the opsHP and its interactions with specific RNAP elements push upstream DNA into the β' zipper and β'CH, thereby supporting RNAP swiveling, and stabilize a paused pre-translocated state with a 10-bp hybrid and extended bubble.The autoinhibited RfaH, in which the RNAP-binding site on NGN is partially masked by KOW α , docks near its final binding site using the opsHP as an anchor to form an encounter complex, in which NGN grasps and twists the opsHP to hyper-stabilize the PEC with an 11bp hybrid.Next, RfaH domains dissociate; NGN takes its final position whereas KOW refolds into the β-barrel.Our structural data and MD simulations show that while binding to the opsPEC triggers the initial domain separation/refolding, the subsequent KOW fold-switch is spontaneous.Following refolding, KOW β may form transient contacts with RNAP awaiting the arrival of the ribosome.Our work portrays a remarkably economical mechanism of deoxyriboregulation of RNAP, in which a merely 12-nt region of NT DNA directs major conformational changes in the transcription machinery that trigger further modulation via a metamorphic accessory factor, ultimately supporting synthesis of vital proteins.

The mechanism of pausing at the ops site
During processive RNA synthesis, RNAP adds nucleotides one by one to the 3'-end of the growing RNA.Following nucleotide addition, the NTP in the active site turns into the RNA 3'end and the 9-bp RNA:DNA hybrid converts into the 10-bp hybrid.RNAP must then translocate by 1 nt, melting 1 bp of the downstream DNA to place the next T-DNA base in the active site, ready to accept the incoming NTP (Figure 7A); the most upstream bp of the hybrid is broken and T-and NT-stands reanneal 66 .A failure to translocate leads to pausing, a key regulatory mechanism used to (i) control the speed of RNA synthesis; (ii) coordinate transcription with other processes such as translation; (iii) recruit regulatory factors; etc. 67 .
Pausing at ops is a prerequisite for RfaH recruitment 11 .The ops element stands out as one of the strongest pauses in E. coli 47 , as could be expected because a failure of RfaH engagement compromises the cell wall integrity 68 , and an exemplar of a consensus pause sequence identified by genome-wide analysis. 47,48The opsPEC structure (Figure 7B) reveals a unique geometry of the nucleic acid scaffold, a canonical 10-bp hybrid but 11-nt long ss NT DNA region instead of 10 in the typical pre-translocated EC 49 .The NT DNA forms the opsHP, which comprises a 3-bp stem and is stabilized by a multitude of positively-charged RNAP residues (Figure 2D); the loop residues G5 and T6 are flipped-out.The two downstream nts, G10 and T11, stack on each other and are further stabilized by βW183, whereas G12 is flipped into the CRE pocket, leaving the C12 T base free to contact βR542 (Figure 2G).Two other features of opsPEC are expected to contribute to pause strength 67 .The TL is trapped in an unfolded conformation by salt bridges between the TL β'R933 and the fork loop βE546/D549 (Figure 2C).The swivel module rotates by 5.8 o (Figure 2B) and is stabilized by opsHP interactions with the β lobe.Recently, it has been suggested that NusG-dependent pausing in Gram-positive bacteria relies on a similar mechanism wherein the NusG-bound NT DNA is placed in a cleft between NusG and the β lobe, blocking RNAP return to the nonswiveled state 69 .Strikingly, through contacts to RNAP, the opsHP is able to lock the swivel module in the absence of additional regulatory inputs.

Sequence-structure relationships in PECs
Comparison of opsPEC with structurally-characterized PECs, hisPEC 45,70 , his-ePEC 71 and con-ePEC 71 , offers insights into the mechanism of pausing and the contributions of the individual sequence determinants.Con-ePEC is paused on a synthetic sequence GG-CAUAGUUG-CG based on the pause consensus ( -12 GG-NNNNNNNN-YG +1 ) 47,48 ; GG and YG motifs are known as upstream and downstream fork junction (UFJ and DFJ) determinants, respectively.The consensus within the 8N region is weak and, in the first approximation, the con-ePEC structure reveals contributions of DFJ and UFJ to pausing.opsPEC (GG-CGGUAGCG-UG) and hisPEC (upstream RNA hairpin followed by CG-AUGUGUGC-UG) form on naturally occurring regulatory pause elements featuring nucleic acid hairpins.In opsPEC, the hairpin forms in the NT DNA and is encoded by the 8N sequence, and perfect DFJ and UFJ are present.In hisPEC, a nascent RNA hairpin forms in the exit channel.hisPEC and his-ePEC, its hairpin-less derivative, have perfect DFJ and a partial UFJ.
The common feature of all PECs is failure to translocate and load the acceptor T-DNA base into the active site.opsPEC and con-ePEC are pre-translocated, his-ePEC equilibrates between pre-and half-translocated states (only RNA translocates), and hisPEC is halftranslocated.The asynchronous (half) translocation leads to tilting of the nucleobases of the RNA:DNA hybrid relative to the helical axis.Since incomplete translocation is characteristic for all pauses, the DFJ and UFJ are likely mainly responsible for the translocation block.
The synthetic con-ePEC and his-ePEC equilibrate between several states with open and closed active sites 71 .In contrast, natural opsPEC and hisPEC are each represented by a single state with an open active site, likely because both are strongly stabilized in a swiveled conformation by nucleic acid hairpins, and swiveling is incompatible with the folded TL 70 .Noteworthy, a fraction of his-ePEC is also swiveled despite lacking the stabilizing hairpin.The causative relationships between hybrid tilting and swiveling were hitherto difficult to delineate.
Considering that opsPEC is swiveled but not tilted suggests that tilting of hisPEC/ePEC is mediated by the 8N sequence.Indeed, AUGUGUGC consists entirely of purine-pyrimidine steps, has diminished intra-strand base stacking, and is expected to favor tilting of nucleobases to enhance the inter-strand stacking of purines.Interestingly, while both opsPEC and hisPEC are swiveled, the swiveling is seemingly attributable not to the UFJ and DFJ determinants, but to pause-specific 8N regions, the opsHP and the tilted hybrid, respectively.
In the latter case, swiveling is further stabilized by the RNA hairpin.
A notable feature of the con-ePEC is an overextended 11-bp RNA:DNA hybrid, which forms only upon active site opening and is not linked to swiveling -all states of con-ePEC are not swiveled.opsPEC also forms an 11-bp hybrid, but only upon RfaH binding.It is tempting to speculate that opsHP inhibits the formation of the overextended hybrid, whereas RfaH binding repositions the opsHP and unleashes the intrinsic potential of opsPEC to form such a state.
Hybrid overextension is a step backward and may conceivably delay translocation, thereby strengthening the pause, but the underlying sequence determinants are difficult to pinpoint unambiguously.One possibility is that the DFJ and UFJ determinants are sufficient to cause hybrid overextension, and the inability of hisPEC/ePEC to form such a state can then be attributed to hybrid tilting, the partial UFJ, or both.Another possibility is that specific determinants within the 8N sequence are the reason for hybrid overextension: con-ePEC and opsPEC, but not hisPEC/ePEC, share G at -2 and C at -9.

The role of ops DNA hairpin in recruitment of RfaH
Our structural analysis reveals that the opsHP forms in the absence of RfaH and recruits the autoinhibited RfaH CC through specific contacts to G5 and T6 residues in the hairpin loop (Figure 3F).The ops pause is a composite element where UFJ and DFJ mediate pausing and nucleobases in the hairpin loop make specific contacts to RfaH 41 .Many potential ops-like hairpins, which differ only in the loop residues and are thus expected to induce pausing, are encoded in the NT strand of MG1655 operons; RfaH is recruited only to ops sites that have a T at position 6 38 , but these comprise fewer than half of ops-like sequences.Do NT-strand hairpins that have A, G or C in the loop recruit, or perhaps exclude, other modulators of transcription elongation?Housekeeping NusGs from Bacillus subtilis 69 and Mycobacterium tuberculosis 72 interact with the NT DNA and B. subtilis NusG displays a preference for Ttracks 73 .E. coli NusG makes no contacts to DNA 12 , but could conceivably be excluded from the EC by NT-DNA structures.By contrast, RfaH orthologs must avoid recruitment to "wrong" sites, and the NT DNA readout provides means for the exquisite selectivity.
Here we show how RfaH, still in the autoinhibited state, sets the stage for the final NGN placement, with three distinct elements making contacts to the opsHP, β'CH and β gate loop.
In the encounter complex (Figure 7C), NGN cannot be fully accommodated between the β lobe, β protrusion, and β'CH, and RNAP swiveling is required to fit the additional bulk of KOW α between these RNAP elements.While anchored at the upstream DNA duplex, NGN stabilizes, twists and repositions the opsHP, leading to melting of one bp in the upstream DNA and inclusion of the melted T-strand nt into the overextended 11-bp hybrid, which counteracts translocation and stabilizes the pause.RfaH CC clamps the NT strand between itself and the β lobe, further hindering return to the non-swiveled state and thus strengthening the pause.
Contacts with opsHP position NGN near the β'CH tip, which starts to wedge between the RfaH domains, initiating displacement of KOW α (Figure 3D).Consequently, RfaH "pulls" itself towards its final binding sites, driving the wedge further and further between KOW and NGN until they dissociate completely.Thus, the opsHP loop has two functions: (i) sequence-specific recognition by RfaH and (ii) anchoring RfaH to initiate its activation.

RfaH accommodation and activation
The opsPEC Rec structure shows that, once unmasked, the NGN helix α3 packs against the β'CH, the principal RfaH-binding site on RNAP.The opsHP:NGN interactions are identical to those in opsPEC Enc , while the region around T73 contacts the βGL, the second RNAP-binding site (Figure 7D).The NGN-RNAP interactions account for the remarkable stability of RfaH-EC contacts throughout transcription and for the anti-pausing activity of RfaH in vitro 11,12,37 but make only a small contribution to its overall effect on gene expression 14 .The KOW binding to ribosome, thought to mediate both the initial ribosome loading and transcription-translation coupling, is critical for RfaH activity and is, in turn, dependent on refolding of the liberated KOW α into a NusG-like β-barrel that creates a contact surface for S10 43 .KOW refolds spontaneously when freed from NGN, as demonstrated by NMR spectroscopy 41,43 , but the fold-switch could be altered in the context of the opsPEC, due to KOW-EC contacts revealed by structures.Our MD simulations reveal that refolding starts from the ends of the α-helical KOW hairpin (Figure 5) consistent with its partial unfolding in the opsPEC Enc .Once the NGN α3 helix is released and gets locked in place, KOW refolds into the five-stranded β-barrel (Movie S2), largely as observed with RfaH in isolation 51 .We conclude that all the information required for KOW transformation is encoded in its primary sequence, and that its contacts to DNA and RNAP are established following the fold-switch.

Pause escape and recruitment of the ribosome
RfaH hyper-stabilizes the ops pause, an effect opposite to its pause-suppressing activity at any other site 40 .We show that opsPEC Rec can extend the RNA but, unable to break NGN-opsHP interactions, backtracks to the original position (Figure 7E).This is reminiscent of PECs stabilized by the initiation σ-factor, which can extend RNA and the bubble, scrunching both DNA strands to maintain contacts to the -10-like NT DNA 74 .The scrunched σ-PECs can escape to elongation, breaking these contacts, or backtrack to the starting position 74 .
It is hypothesized that the energy stored in the scrunched strands is used to break the σ-DNA contacts to overcome the pause 74 .We do not observe scrunching in opsPEC Rec or opsPEC Back , but the transition to opsPEC Back must be accompanied by at least two translocation steps along the DNA and likely requires scrunching, as proposed earlier 75 .The energy stored in the compressed bubble/hybrid and hypothetical scrunched states could be used to break the NGN/NT contacts and melt the opsHP to overcome the pause.
Rather than being an unavoidable consequence of overly stable protein-DNA contacts, pausing mediated by specific recognition of NT DNA may facilitate recruitment of protein partners to the EC.The phage λ Q antiterminator loads onto RNAP at a σ-dependent promoter-proximal pause through direct contacts to σ 76 .We hypothesize that a ribosome is recruited at the ops site through direct contacts to KOW β , and that backtracking extends the time window for this recruitment.By bridging RNAP and ribosome, RfaH will promote ribosome scanning the mRNA for a start codon, where the translation initiation complex assembles, and coupling thereafter (Figure 7F).Failure to load the ribosome abolishes the expression of RfaHcontrolled genes 32 .It remains to be determined which form of the ribosome is recruited by RfaH and when, and whether RNA is required during the initial loading.

Concluding remarks
Collectively, our results show that the information encoded in the ops and RfaH sequences fully controls every step in the RfaH cycle -from RNAP pausing and RfaH loading (ops) to RfaH activation and finally ribosome recruitment (RfaH).The KOW transformation that activates RfaH, initially thought to be unique to that protein, has recently been discovered to be ancient and ubiquitous 77 , arguing that our insights into the mechanisms of recruitment and metamorphosis of RfaH would be applicable to NusG homologs across all life.
Here, we deciphered the molecular details that control the fold-switch in RfaH, providing insights into other fold-switch mechanisms.Recent estimates suggest that ~4 % of structures deposited in the PDB may belong to fold-switching proteins 78 , a figure that almost certainly understates the prevalence and significance of this phenomenon, given the bias for a single, lowest-energy structure in the most commonly used structural methods.It is worth noting that despite its relatively small size (50 residues in the KOW domain and 162 residues overall), RfaH nonetheless undergoes the most drastic fold-switch possible -from all-α to all-β.Having shown here that the sequence of RfaH itself is responsible for this transformation, we have definitively established the suitability of RfaH as an eminently tractable (both experimentally and computationally) model system for investigating the molecular determinants of protein fold-switching.

LIMITATIONS OF THE STUDY
Our current cryo-EM reconstructions capture snapshots of key intermediates in the RfaH activation pathway but provide no information on transitions between these states.For example, we did not observe scrunching, which we presume accompanies nucleotide addition when translocation is hindered by RfaH contacts to opsHP.The encounter complex was obtained with RfaH CC in which the domains are locked by a disulfide bridge; although we have shown that RfaH CC behaves as the unmodified RfaH under reducing conditions, we cannot exclude minor differences in recruitment.The ribosome loading, a critical step in RfaH function, is a black box.Following RfaH recruitment and activation, RNAP translocation, and accompanying structural changes in the EC in real time would be required to investigate this possibility, using single-molecule approaches to account for the system heterogeneity.Finally, in vitro experiments have limitations that can only be overcome by studying the process in vivo, which may be possible due to advances in super-resolution microscopy.

Figure 2 .
Figure 2. Structural basis of transcriptional pausing in the opsPEC.

(
bottom).(G) A summary of nucleic acid changes.(H) Nucleic acids (cartoon/stick) and β' lid loop, β' rudder loop and β'CH (cartoon).C1 T and G1 R form an 11 th bp in opsPEC Enc (cyan).The upstream DNA vectors are indicated by arrows.(I) A close-up view of the upstream fork junction.Selected β'CH side chains are shown as sticks; polar (dashed lines) and stacking (red arrow) interactions are indicated.In panels B, C, D, E, F, H and I, orientations are relative to the standard view (Figure2A).

Figure 4 .
Figure 4. Recruited RfaH is activated and keeps opsPEC in the hyper-paused state (A) Superposition of opsPEC Rec (in colour) and nc-opsPEC Rec (grey).3DVA of opsPEC Rec cryoEM maps revealed two extrema, structure 1 (left) with KOW β bound to the β'ZBD and poorly ordered opsHP DNA (the latter only observed for nc-opsPEC Rec ) and structure 2 (right) exhibiting no cryoEM density of KOW β but well-defined opsHP density.

Figure 6 .
Figure 6.RNAP escape from ops is hindered by backtracking and requires auxiliary